Abstract

We prove a theorem that reduces bounding the mixing time of a card shuffle to verifying a 
condition that involves only pairs of cards, then we use it to obtain improved bounds for two 
previously studied models. 

E. Thorp introduced the following card shuffling model in 1973. Suppose the number of 
cards n is even. Cut the deck into two equal piles. Drop the first card from the left pile or 
from the right pile according to the outcome of a fair coin flip. Then drop from the other pile. 
Continue this way until both piles are empty. We obtain a mixing time bound of $O(\log^4 n)$.
Previously, the best known bound was $O(\log^{29} n)$ and previous proofs were only valid for $n$ a
power of 2.

We also analyze the following model, called the L-reversal chain, introduced by Durrett. 
There are n cards arrayed in a circle. Each step, an interval of cards of length at most L is 
chosen uniformly at random and its order is reversed. Durrett has conjectured that the mixing
time is $O(\max(n, n^3/L^3)\log n)$. We obtain a bound that is within a factor $O(\log^2 n)$ of this, the
first bound within a poly log factor of the conjecture.

1 Introduction 

Card shuffling has a rich history in mathematics, dating back to work of Markov [12] and Poincaré
[16]. A basic problem is to determine the mixing time, i.e., the number of shuffles necessary to mix
up the deck (see Section 1.1 for a precise definition). A natural first step (used as far back as Borel
and Chéron [2] in 1940) is to determine the number of steps necessary to randomize single cards
and pairs. Clearly this is always a lower bound for the mixing time. On the other hand, it is often
not far from an upper bound as well; for a number of models of card shuffling (see, e.g., Diaconis
and Shahshahani [7], Wilson [17], or Bayer and Diaconis [1]) the mixing time is only a small
factor (e.g. $O(1)$ or $O(\log n)$) larger than the time required to mix pairs. This suggests finding a
general method that reduces bounding the mixing time (in the global sense that the distribution
on all $n!$ permutations is roughly uniform) to verifying a local condition that involves only pairs
of cards. In this paper, we introduce such a method and use it to analyze two previously studied
models. In both cases we find an upper bound for the mixing time that is within a poly logarithmic
factor of optimal.

We study card shuffles that can be viewed as generalizations of three card Monte. In three card 
Monte, the cards are spread out face down on a table. In one step, the dealer chooses two cards, 
puts them together and then separates them quickly so that an observer cannot tell which is which.
We call this operation a collision, and model it mathematically as a random permutation that is 
an even mixture of a transposition and the identity. We prove a general theorem that applies to 
any method of shuffling that uses collisions. The theorem bounds the change in relative entropy 
after many steps of the chain, based on something that is related to the interactions between pairs 
of cards. Next we use the theorem to analyze two card shuffling models, the Thorp shuffle and 
Durrett's L-reversal model. 

1.1 Applications 

In this section we describe two applications of our main theorem. First, we give a formal definition 
of the mixing time. Let p(x, y) be transition probabilities for a Markov chain on a finite state space 
$V$ with a uniform stationary distribution. For probability measures $\mu$ and $\nu$ on $V$, define the total
variation distance $\|\mu - \nu\| = \frac{1}{2}\sum_{x \in V} |\mu(x) - \nu(x)|$, and define the mixing time

$$T_{\mathrm{mix}} = \min\{n : \|p^n(x, \cdot) - \mathcal{U}\| \le \tfrac{1}{4} \text{ for all } x \in V\}, \tag{1}$$

where $\mathcal{U}$ denotes the uniform distribution.

Our first application is the Thorp shuffle, which is defined as follows. Assume that the number
of cards, $n$, is even. Cut the deck into two equal piles. Drop the first card from the left pile or the
right pile according to the outcome of a fair coin flip; then drop from the other pile. Continue this
way, with independent coin flips deciding whether to drop left-right or right-left each time,
until both piles are empty.

The Thorp shuffle, despite its simple description, has been hard to analyze. Determining its
mixing time has been called the "longest-standing open card shuffling problem" [5]. In [15] the
author obtained the first poly log upper bound, proving a bound of $O(\log^{44} n)$, valid when $n$ is a
power of 2. Montenegro and Tetali [14] built on this to get a bound of $O(\log^{29} n)$. In the present
paper, we dispense with the power-of-two assumption and get an improved bound of $O(\log^4 n)$.

We also analyze a Markov chain that was introduced by Durrett [9] as a model for evolution
of a genome (see [10]). In the L-reversal chain there are two parameters, $n$ and $L$. The cards are
located at the vertices of an $n$-cycle, which we label $0, \dots, n-1$. Each step, a (nonempty) interval
of cards of length at most $L$ is chosen uniformly at random and its order is reversed. By the coupon
collector problem, on the order of $n \log n$ steps are needed to break adjacencies between neighboring
pairs. Furthermore, the mixing time for a single card is on the order $n^3/L^3$, because each step the
probability that a particular card moves is on the order of $L/n$ and each time a card moves it performs
a step of a symmetric random walk with typical displacement on the order $L$. These considerations led
Durrett to the following conjecture.

Conjecture (Durrett). The mixing time for the L-reversal chain is $O(\max(n, n^3/L^3)\log n)$.

In [9], Durrett proves the corresponding lower bound using Wilson's technique [17] based on
eigenfunctions. The spectral gap was determined to be within constant factors of $\max(n, n^3/L^3)$ by
Cancrini, Caputo and Martinelli [3]. The best previously-known bound for the mixing time, which could be
obtained by applying standard comparison techniques, was within a factor $O(n^{2/3})$ of Durrett's
conjecture in the worst case.

Durrett's conjecture has presented a challenge to existing techniques. As shown by Martinelli
et al., the log Sobolev constant does not give the conjectured mixing time. Furthermore, the mixing
time in $L^2$ (defined by replacing total variation distance by an appropriate $L^2$ distance in equation
(1)) can be nearly $n^{1/3}$ times the conjecture, as the following example shows. Let $L = n^{2/3}$, so
that the conjectured mixing time is $O(n \log n)$. We claim that in this case the $L^2$ mixing time is
at least $cn^{4/3}$ for a constant $c$. Let $A$ be the event that cards $1, \dots, n/2$ occupy positions $1, \dots, n/2$
in any order. If the initial ordering is the identity permutation, then after $t$ shuffles we have

$$\mathbf{P}(A) \ge \mathbf{P}(\text{none of the reversed intervals contained cards } 1 \text{ or } n/2) \ge \left(1 - \frac{2L}{n}\right)^{t},$$

which is much larger than $\binom{n}{n/2}^{-1}$ unless $t \ge cn^{4/3}$ for a constant $c$. Since mixing in $L^2$ implies
convergence of transition probabilities, the $L^2$ mixing time is at least on the order of $n^{4/3}$, which
is higher than the conjecture. This means that in order to prove the conjectured bound on the
mixing time in total variation, one cannot use any method for bounding mixing times that gives a
bound in $L^2$.

In the present paper, we prove that the mixing time is $O\big((n \vee \frac{n^3}{L^3})\log^3 n\big)$. This is the first
upper bound that is within a poly log factor of the conjecture.

The remainder of this paper is organized as follows. In Section 2 we give some necessary
background on entropy and prove some elementary inequalities. In Section 3 we define Monte
shuffles, the general model of card shuffling to which our main theorem will apply. In Section 4 we
prove the main theorem. In Section 5 we analyze the Thorp shuffle and in Section 6 we analyze
the L-reversal chain.

2 Background 

For a probability distribution $\{p_i : i \in V\}$, define the (relative) entropy of $p$ by
$\mathrm{ENT}(p) = \sum_{i \in V} p_i \log(|V| p_i)$, where we define $0 \log 0 = 0$. The following well-known inequality
links relative entropy to total variation distance. Let $\mathcal{U}$ denote the uniform distribution over $V$. Then

$$\|p - \mathcal{U}\| \le \sqrt{\tfrac{1}{2}\mathrm{ENT}(p)}. \tag{2}$$

If $X$ is a random variable (or random permutation) taking finitely many values, define $\mathrm{ENT}(X)$
as the relative entropy of the distribution of $X$. Note that if $\mathbf{P}(X = i) = p_i$ for $i \in V$ then
$\mathrm{ENT}(X) = \mathbf{E}(\log(|V| p_X))$. We shall think of the distribution of a random permutation in $S_n$ as a
sequence of probabilities of length $n!$, indexed by permutations in $S_n$. If $\mathcal{F}$ is a sigma-field, then we
shall write $\mathrm{ENT}(X \mid \mathcal{F})$ for the relative entropy of the conditional distribution of $X$ given $\mathcal{F}$. Note
that $\mathrm{ENT}(X \mid \mathcal{F})$ is a random variable. If $\pi$ is a random permutation in $S_n$, then for $1 \le k \le n$,
define $\mathcal{F}_k = \sigma(\pi^{-1}(k), \dots, \pi^{-1}(n))$, and define $\mathrm{ENT}(\pi, k) = \mathrm{ENT}(\pi^{-1}(k) \mid \mathcal{F}_{k+1})$ (where we think
of the conditional distribution of $\pi^{-1}(k)$ given $\mathcal{F}_{k+1}$ as being a sequence of length $k$). The standard
entropy chain rule (see, e.g., [4]) gives the following proposition.

Proposition 1 For any $i \le n$ we have

$$\mathrm{ENT}(\pi) = \mathbf{E}\big(\mathrm{ENT}(\pi \mid \mathcal{F}_i)\big) + \sum_{k=i}^{n} \mathbf{E}\big(\mathrm{ENT}(\pi, k)\big).$$

To compute the relative entropy in the first term on the right hand side, we think of the distribution
of $\pi$ given $\mathcal{F}_i$ as a sequence of probabilities of length $(i-1)!$.

Remark: Substituting $i = 1$ into the formula gives $\mathrm{ENT}(\pi) = \sum_{k=1}^{n} \mathbf{E}(\mathrm{ENT}(\pi, k))$. □
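
The chain rule in the Remark can be checked numerically on a small deck. The following is a minimal sketch, not part of the original argument: it assumes positions are labelled $0, \dots, n-1$ (so position $k$ has $k+1$ candidate cards once the later positions are known), and the helper names `relative_entropy`, `chain_rule_terms` and `random_distribution` are hypothetical.

```python
import itertools
import math
import random

def relative_entropy(probs, size):
    """Sum of p*log(size*p), with the convention 0*log(0) = 0."""
    return sum(p * math.log(size * p) for p in probs if p > 0)

def chain_rule_terms(dist, n):
    """dist maps a tuple inv (inv[k] = card at position k) to its probability.

    Returns the list of E(ENT(pi, k)) for positions k = 0, ..., n-1.
    """
    terms = []
    for k in range(n):
        # Group permutations by the cards occupying positions k+1, ..., n-1.
        groups = {}
        for inv, p in dist.items():
            cond = groups.setdefault(inv[k + 1:], {})
            cond[inv[k]] = cond.get(inv[k], 0.0) + p
        total = 0.0
        for cond in groups.values():
            mass = sum(cond.values())
            if mass > 0:
                # Conditional law of the card at position k, a sequence of length k+1.
                total += mass * relative_entropy([q / mass for q in cond.values()], k + 1)
        terms.append(total)
    return terms

def random_distribution(n, rng):
    """An arbitrary (assumed) test distribution on the permutations of n cards."""
    perms = list(itertools.permutations(range(n)))
    weights = [rng.random() for _ in perms]
    total = sum(weights)
    return {perm: w / total for perm, w in zip(perms, weights)}
```

Summing the returned terms should reproduce the relative entropy of the whole distribution with respect to the uniform distribution on all $n!$ orderings, up to floating-point error.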

If we think of $\pi$ as representing the order of a deck of cards, with $\pi(i)$ = location of card $i$,
then this allows us to think of $\mathbf{E}(\mathrm{ENT}(\pi, k))$ as the portion of the overall entropy $\mathrm{ENT}(\pi)$ that
is attributable to the location $k$. If $S \subset \{1, \dots, n\}$ is a set of positions then we shall refer to the
quantity $\sum_{k \in S} \mathbf{E}(\mathrm{ENT}(\pi, k))$ as the entropy that is attributable to $S$.

Definition 2 For $p, q \ge 0$, define $d(p, q) = \frac{1}{2} p \log p + \frac{1}{2} q \log q - \frac{p+q}{2} \log\left(\frac{p+q}{2}\right)$.

We will need the following proposition.

Proposition 3 Fix $p \ge 0$. The function $d(p, \cdot)$ is convex.

Proof: A calculation shows that the second derivative is positive. □



Observe that $d(p, q) \ge 0$, with equality iff $p = q$ by the strict convexity of the function $x \mapsto x \log x$.
Furthermore, some calculations give

$$d(p, q) = \frac{p+q}{2}\, f\!\left(\frac{p-q}{p+q}\right), \tag{3}$$

where $f(\Delta) = \frac{1}{2}(1 + \Delta)\log(1 + \Delta) + \frac{1}{2}(1 - \Delta)\log(1 - \Delta)$. If $p = \{p_i : i \in V\}$ and $q = \{q_i : i \in V\}$
are both probability distributions on $V$, then we can define the "distance" $d(p, q)$ between $p$ and $q$,
by $d(p, q) = \sum_{i \in V} d(p_i, q_i)$. (We use the term distance loosely and don't claim that $d(\cdot, \cdot)$ satisfies
the triangle inequality.) Note that $d(p, q)$ is the difference between the average of the entropies of
$p$ and $q$ and the entropy of the average (i.e. an even mixture) of $p$ and $q$.
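
As an illustration, Definition 2 and identity (3) can be compared numerically. This is a minimal sketch under the convention $0 \log 0 = 0$; the function names (`xlogx`, `d_scalar`, `f`, `d_via_f`, `d_dist`) are hypothetical and chosen only for this example.

```python
import math

def xlogx(x):
    """x*log(x) with the convention 0*log(0) = 0."""
    return 0.0 if x == 0 else x * math.log(x)

def d_scalar(p, q):
    """Definition 2: average of x log x at p and q, minus x log x at the midpoint."""
    mid = (p + q) / 2
    return 0.5 * xlogx(p) + 0.5 * xlogx(q) - xlogx(mid)

def f(delta):
    """f(Delta) = (1/2)(1+Delta)log(1+Delta) + (1/2)(1-Delta)log(1-Delta)."""
    return 0.5 * xlogx(1 + delta) + 0.5 * xlogx(1 - delta)

def d_via_f(p, q):
    """Right-hand side of identity (3); defined as 0 when p = q = 0."""
    if p + q == 0:
        return 0.0
    return (p + q) / 2 * f((p - q) / (p + q))

def d_dist(p, q):
    """The 'distance' between two distributions given as equal-length lists."""
    return sum(d_scalar(a, b) for a, b in zip(p, q))
```

For instance, `d_scalar(0.3, 0.7)` and `d_via_f(0.3, 0.7)` should agree up to rounding, and `d_scalar(x, x)` should vanish for every `x`.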
We will use the following projection lemma. 

Lemma 4 Let $X$ and $Y$ be random variables with distributions $p$ and $q$, respectively. Fix a function
$g$ and let $P$ and $Q$ be the distributions of $g(X)$ and $g(Y)$, respectively. Then $d(p, q) \ge d(P, Q)$.



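Proof: Let $S_i = \{x : g(x) = i\}$. Then $P_i = \sum_{x \in S_i} p_x$ and $Q_i = \sum_{x \in S_i} q_x$. We have

$$d(p, q) = \sum_i \sum_{x \in S_i} d(p_x, q_x) \tag{4}$$

$$= \sum_i \sum_{x \in S_i} \frac{p_x + q_x}{2}\, f\!\left(\frac{p_x - q_x}{p_x + q_x}\right) \tag{5}$$

$$= \sum_i \frac{P_i + Q_i}{2} \sum_{x \in S_i} \frac{p_x + q_x}{P_i + Q_i}\, f\!\left(\frac{p_x - q_x}{p_x + q_x}\right). \tag{6}$$

Note that $f$ has a positive second derivative, hence is convex. Thus by Jensen's inequality, the
quantity (6) is at least

$$\sum_i \frac{P_i + Q_i}{2}\, f\!\left(\sum_{x \in S_i} \frac{p_x + q_x}{P_i + Q_i} \cdot \frac{p_x - q_x}{p_x + q_x}\right) \tag{7}$$

$$= \sum_i \frac{P_i + Q_i}{2}\, f\!\left(\frac{P_i - Q_i}{P_i + Q_i}\right) \tag{8}$$

$$= d(P, Q). \tag{9}$$

□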
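
The projection lemma lends itself to a quick randomized sanity check. The sketch below is illustrative only: the state space size, the number of labels and the helper names (`d_dist`, `push_forward`, `random_distribution`) are assumptions made for the example, and a passing check is of course not a proof.

```python
import math
import random

def xlogx(x):
    """x*log(x) with the convention 0*log(0) = 0."""
    return 0.0 if x == 0 else x * math.log(x)

def d_dist(p, q):
    """Sum over states of d(p_i, q_i) from Definition 2."""
    return sum(0.5 * xlogx(a) + 0.5 * xlogx(b) - xlogx((a + b) / 2)
               for a, b in zip(p, q))

def push_forward(p, g, num_labels):
    """Distribution of g(X) when X has distribution p; g is a list of labels."""
    out = [0.0] * num_labels
    for x, px in enumerate(p):
        out[g[x]] += px
    return out

def random_distribution(size, rng):
    """An arbitrary (assumed) test distribution on {0, ..., size-1}."""
    weights = [rng.random() for _ in range(size)]
    total = sum(weights)
    return [w / total for w in weights]
```

Drawing `p`, `q` and a labelling `g` at random and comparing `d_dist(p, q)` with `d_dist(push_forward(p, g, k), push_forward(q, g, k))` should never show the projected value exceeding the original one, apart from rounding.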



Let $\mathcal{U}$ denote the uniform distribution on $V$. Note that if $\mu$ is an arbitrary distribution on $V$, then
$\mathrm{ENT}(\mu)$ and $d(\mu, \mathcal{U})$ are both notions of a distance from $\mu$ to $\mathcal{U}$. The following lemma relates the
two.

Lemma 5 For any distribution $\mu$ on $V$ we have

$$d(\mu, \mathcal{U}) \ge \frac{c}{\log |V|}\, \mathrm{ENT}(\mu)$$

for a universal constant $c > 0$.

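Proof: Let $n = |V|$, define $\tilde\mu = n\mu$ and define $g : (0, \infty) \to \mathbf{R}$ by $g(x) = x \log x - (x - 1)$. Then

$$\mathrm{ENT}(\mu) = \sum_{i \in V} \mu(i) \log(n\mu(i)) \tag{10}$$

$$= \frac{1}{n} \sum_{i \in V} \big[\tilde\mu(i) \log \tilde\mu(i) - (\tilde\mu(i) - 1)\big] \tag{11}$$

$$= \frac{1}{n} \sum_{i \in V} g(\tilde\mu(i)), \tag{12}$$

where the second equality holds because $\sum_{i \in V} (\tilde\mu(i) - 1) = 0$. Thus it's enough to show for a
universal constant $c$ we have

$$d\big(\mu(i), \tfrac{1}{n}\big) \ge \frac{c}{n \log n}\, g(\tilde\mu(i)), \tag{13}$$

for all $i \in V$. Fix $i \in V$ and let $x = \tilde\mu(i)$. Then by equation (3) we have

$$d\big(\mu(i), \tfrac{1}{n}\big) = \frac{1}{n}\, d(x, 1) \tag{14}$$

$$= \frac{x+1}{2n}\, f\!\left(\frac{x-1}{x+1}\right), \tag{15}$$

where $f(\Delta) = \frac{1}{2}(1 + \Delta)\log(1 + \Delta) + \frac{1}{2}(1 - \Delta)\log(1 - \Delta)$. Thus it remains to show that the function
$R(x)$ defined by

$$R(x) = \frac{g(x)}{\frac{x+1}{2}\, f\!\left(\frac{x-1}{x+1}\right)} \tag{16}$$

is at most $c^{-1} \log n$ on the interval $[0, n]$, for a constant $c > 0$. Note that $R(x)$ is bounded on
the interval $[0, 2]$. (This can be seen by applying L'Hôpital's rule twice for the point $x = 1$.) Let
$x \in [2, n]$. The denominator in (16) is at least

$$\frac{x+1}{2}\, f\big(\tfrac{1}{3}\big) \ge \frac{x}{2}\, f\big(\tfrac{1}{3}\big),$$

since the function $x \mapsto f\big(\frac{x-1}{x+1}\big)$ is increasing on $[2, \infty)$. The numerator is $g(x) \le x \log x \le x \log n$.
Thus $R(x) \le 2 \log n / f(\frac{1}{3})$ on the interval $[2, n]$ and the proof is complete. □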
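
The final estimate of the proof, $R(x) \le 2\log n / f(\frac{1}{3})$ on $[2, n]$, can be probed on a grid. The following sketch is only a numerical illustration under the definitions of $g$, $f$ and $R$ above; the names `g`, `f`, `R` and `bound` mirror the text, and the grid and the value of `n` are arbitrary choices.

```python
import math

def xlogx(x):
    """x*log(x) with the convention 0*log(0) = 0."""
    return 0.0 if x == 0 else x * math.log(x)

def f(delta):
    """f(Delta) = (1/2)(1+Delta)log(1+Delta) + (1/2)(1-Delta)log(1-Delta)."""
    return 0.5 * xlogx(1 + delta) + 0.5 * xlogx(1 - delta)

def g(x):
    """g(x) = x log x - (x - 1)."""
    return xlogx(x) - (x - 1)

def R(x):
    """The ratio in (16); undefined at x = 1, where it is a 0/0 limit."""
    return g(x) / ((x + 1) / 2 * f((x - 1) / (x + 1)))

def bound(n):
    """The upper bound 2 log n / f(1/3) from the end of the proof."""
    return 2 * math.log(n) / f(1 / 3)
```

Evaluating `R` near `x = 1` (for example at `1.001`) gives a value close to 4, which is consistent with the boundedness on $[0, 2]$ claimed via L'Hôpital's rule.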



3 General set-up: card shuffles with collisions 
3.1 Collisions 

We shall now define a collision, which is the basic ingredient in all of the card shuffles analyzed in
the present paper. If $\pi$ is a random permutation in $S_n$ such that

$$\pi = \begin{cases} \mathrm{id} & \text{with probability } \tfrac{1}{2}; \\ (a, b) & \text{with probability } \tfrac{1}{2}, \end{cases}$$

for some $a, b \in \{1, 2, \dots, n\}$ (where we write id for the identity permutation and $(a, b)$ for the
transposition of $a$ and $b$), then we will call $\pi$ a collision. If $\pi$ and $\mu$ are permutations in $S_n$, then
we write $\pi\mu$ for the composition $\mu \circ \pi$.

A card shuffle can be described as a random permutation chosen from a certain probability
distribution. If we start with the identity permutation and each shuffle has the distribution of $\pi$,
then after $t$ steps the cards are distributed like $\pi_1 \cdots \pi_t$, where the $\pi_i$ are i.i.d. copies of $\pi$. In this
paper, we shall consider shuffling permutations $\pi$ that can be written in the form

$$\pi = \nu\, c(a_1, b_1)\, c(a_2, b_2) \cdots c(a_k, b_k), \tag{17}$$

where $\nu$ is an arbitrary random permutation, the numbers $a_1, \dots, a_k, b_1, \dots, b_k$ are distinct, and
$c(a_j, b_j)$ is a collision of $a_j$ and $b_j$. The values of $a_j$ and $b_j$ and the number of collisions (which can
be zero) may depend on $\nu$, but conditional on $\nu$ the $c(a_j, b_j)$ are independent collisions. We shall
call shuffles of this type Monte.
For $t \ge 1$, define $\pi_{(t)} = \pi_1 \cdots \pi_t$.



3.2 Warm-Up Lemma 

In this section we prove a simple lemma with a short proof that brings out many of the central
ideas of our main theorem (Theorem 9 below). We start with an easy proposition.

Proposition 6 Suppose that $\pi$ is any fixed permutation. Then

$$\mathrm{ENT}(\mu\pi) = \mathrm{ENT}(\mu).$$

Proof: Up to a re-labeling of indices, the random permutation $\mu\pi$ has the same distribution as
$\mu$, hence the same relative entropy. □

If $\pi$ is random and independent of $\mu$ then $\mathrm{ENT}(\mu\pi) \le \mathrm{ENT}(\mu)$, which follows by conditioning on
$\pi$, applying Proposition 6 and then applying Jensen's inequality to the function $x \mapsto x \log x$. It
follows that if $\pi_1, \pi_2, \dots$ are i.i.d. copies of $\pi$ then $\mathrm{ENT}(\pi_1 \cdots \pi_k)$ is nonincreasing in $k$. In this
section we study the decay of entropy $\mathrm{ENT}(\mu\pi) - \mathrm{ENT}(\mu)$ in the case where the permutation $\pi$ is
a collision.

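The following lemma relates to the case where $\pi$ is a collision between the $j$th card and another
card of smaller index. The lemma says that the relative entropy is reduced by at least
$c\,\mathrm{ENT}(\mu, j)/\log n$, on average (where "on average" means with respect to the different possible
choices of indices $i \le j$).

Lemma 7 Let $\mu$ be a random permutation. Then for a universal constant $c$ we have

$$j^{-1} \sum_{i \le j} \mathrm{ENT}(\mu c(i, j)) \le \mathrm{ENT}(\mu) - c\, \mathrm{ENT}(\mu, j)/\log n.$$

Proof: Using the abuse of notation $\frac{1}{2}\pi_1 + \frac{1}{2}\pi_2$ for a random permutation whose distribution is an
even mixture of the distributions of $\pi_1$ and $\pi_2$, we have

$$\mu c(i, j) = \tfrac{1}{2}\mu + \tfrac{1}{2}\mu(i, j).$$

Let $\mathcal{L}(X \mid \mathcal{F})$ denote the conditional distribution of random variable (or random permutation) $X$
given the sigma field $\mathcal{F}$. Let $\tilde\mu = \mu(i, j)$ (i.e., the product of $\mu$ and the transposition $(i, j)$). Note
that $\tilde\mu$ and $\mu$ are the same, except that $\tilde\mu^{-1}(i) = \mu^{-1}(j)$ and $\tilde\mu^{-1}(j) = \mu^{-1}(i)$, and recall that $i \le j$.

It follows that $\mathrm{ENT}(\tilde\mu \mid \mathcal{F}_{j+1}) = \mathrm{ENT}(\mu \mid \mathcal{F}_{j+1})$ and hence
$\mathrm{ENT}(\mu c(i, j) \mid \mathcal{F}_{j+1}) - \mathrm{ENT}(\mu \mid \mathcal{F}_{j+1}) = -d(\mathcal{L}(\tilde\mu \mid \mathcal{F}_{j+1}), \mathcal{L}(\mu \mid \mathcal{F}_{j+1}))$. But by the projection lemma,

$$d(\mathcal{L}(\tilde\mu \mid \mathcal{F}_{j+1}), \mathcal{L}(\mu \mid \mathcal{F}_{j+1})) \ge d(\mathcal{L}(\tilde\mu^{-1}(j) \mid \mathcal{F}_{j+1}), \mathcal{L}(\mu^{-1}(j) \mid \mathcal{F}_{j+1}))$$

$$= d(\mathcal{L}(\mu^{-1}(i) \mid \mathcal{F}_{j+1}), \mathcal{L}(\mu^{-1}(j) \mid \mathcal{F}_{j+1})).$$

Hence

$$j^{-1} \sum_{i \le j} \mathrm{ENT}(\mu c(i, j) \mid \mathcal{F}_{j+1}) - \mathrm{ENT}(\mu \mid \mathcal{F}_{j+1}) = -j^{-1} \sum_{i \le j} d(\mathcal{L}(\tilde\mu \mid \mathcal{F}_{j+1}), \mathcal{L}(\mu \mid \mathcal{F}_{j+1}))$$

$$\le -d(\mathcal{U}, \mathcal{L}(\mu^{-1}(j) \mid \mathcal{F}_{j+1}))$$

$$\le -\frac{c}{\log n}\, \mathrm{ENT}(\mathcal{L}(\mu^{-1}(j) \mid \mathcal{F}_{j+1})), \tag{18}$$

where the first inequality is by Proposition 3 (applied after the projection bound above) and the second
is by Lemma 5. Here $\mathcal{U}$ denotes the uniform distribution over $\{1, \dots, n\} - \{\mu^{-1}(j+1), \dots, \mu^{-1}(n)\}$,
which is the average $j^{-1} \sum_{i \le j} \mathcal{L}(\mu^{-1}(i) \mid \mathcal{F}_{j+1})$. Taking expectations gives

$$j^{-1} \sum_{i \le j} \mathbf{E}(\mathrm{ENT}(\mu c(i, j) \mid \mathcal{F}_{j+1})) - \mathbf{E}(\mathrm{ENT}(\mu \mid \mathcal{F}_{j+1})) \le -\frac{c}{\log n}\, \mathrm{ENT}(\mu, j). \tag{19}$$

Since $\mathrm{ENT}(\mu, k) = \mathrm{ENT}(\mu c(i, j), k)$ for all $k \ge j + 1$, Proposition 1 and equation (19) yield the
lemma. □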
4 Main Theorem

Let $\pi$ be a random permutation in $S_n$ that is Monte (i.e., can be written in the form (17)) and let
$\pi_1, \pi_2, \dots$ be independent copies of $\pi$. For $t \ge 1$ let $\pi_{(t)} = \pi_1 \cdots \pi_t$.

Convention. We shall use the following convention throughout. For integers $x$ with $1 \le x \le n$,
we denote by card $x$ the card initially in position $x$.

For cards $x$ and $y$, say that $x$ collides with $y$ at time $m$ if for some $i$ and $j$ we have $\pi_{(m)}^{-1}(i) = x$,
$\pi_{(m)}^{-1}(j) = y$, and $\pi_m$ has a collision of $i$ and $j$.
We will need the following definition. 

Definition 8 For a random variable $X$, a finite set $S$ and a real number $A \in [0, 1]$, say that the
distribution of $X$ is $A$-uniform over $S$ if

$$\mathbf{P}(X = i) \ge A|S|^{-1}$$

for all $i \in S$.

Remark: If $A < 1$ then the distribution of $X$ need not be concentrated on $S$. (But if $A = 1$, then
$X$ is uniform over $S$.) □

Our main theorem is a generalization of Lemma 7. It generalizes from a collision to an arbitrary
Monte shuffle, and it bounds the loss in relative entropy after many steps.

Theorem 9 Let $\pi$ be a Monte shuffle on $n$ cards. Fix an integer $t > 0$ and suppose that $T$ is a
random variable taking values in $\{1, \dots, t\}$, which is independent of the shuffles $\{\pi_i : i > 0\}$. For a
card $x$, let $b(x)$ denote the first card to collide with $x$ after time $T$ (or $b(x) = x$ if there is no such
card). Define the match $m(x)$ of $x$ by

$$m(x) := \begin{cases} b(x) & \text{if } x = b(b(x)); \\ x & \text{otherwise.} \end{cases}$$

Suppose that for every card $i$ there is a constant $A_i \in [0, 1]$ such that the distribution of $m(i)$
is $A_i$-uniform over $\{1, \dots, i\}$. Let $\mu$ be an arbitrary random permutation that is independent of
$\{\pi_i : i > 0\}$. Then

$$\mathrm{ENT}(\mu\pi_{(t)}) - \mathrm{ENT}(\mu) \le -\frac{C}{\log n} \sum_{k=1}^{n} A_k E_k,$$

where $E_k = \mathbf{E}(\mathrm{ENT}(\mu, k))$ and $C$ is a universal constant.
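
To make the definition of the match concrete, the following sketch computes $b(x)$ and $m(x)$ from a recorded list of collisions. It is illustrative only: the input format (a list of `(time, x, y)` triples of colliding cards) and the function name `matches` are assumptions, and "after time $T$" is read here as "at a time strictly greater than $T$", which is an interpretation rather than something fixed by the text.

```python
def matches(collisions, T, cards):
    """Return a dict mapping each card x to its match m(x).

    collisions: iterable of (time, x, y) triples, meaning cards x and y collide
                at the given time (assumed: each card is in at most one
                collision per time step).
    T: only collisions at times strictly greater than T are considered
       (an assumed reading of "after time T").
    """
    first = {}
    for time, x, y in sorted(collisions):
        if time <= T:
            continue
        # Record only the first partner of each card after time T.
        first.setdefault(x, y)
        first.setdefault(y, x)
    # b(x) = first card to collide with x, or x itself if there is none.
    b = {x: first.get(x, x) for x in cards}
    # m(x) = b(x) when the relation is mutual, and x otherwise.
    return {x: (b[x] if b[b[x]] == x else x) for x in cards}
```

With collisions `[(1, 1, 2), (2, 2, 3), (3, 1, 4)]` and `T = 0`, cards 1 and 2 are matched to each other, while cards 3 and 4 are matched to themselves because their first partners had already collided with someone else.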

Proof: Let $M = (m(i) : 1 \le i \le n)$. For $i$ and $j$ with $j < i$, let $c(i, j)$ be a collision of $i$ and $j$.
Assume that all of the $c(i, j)$ are independent of $\mu$, $\pi_{(t)}$ and each other. Note that

$$\Big[\prod_{i : m(i) < i} c(i, m(i))\Big] \pi_{(t)}$$

has the same distribution as $\pi_{(t)}$, so it is enough to bound the relative entropy of the distribution
of $\mu \big[\prod_{i : m(i) < i} c(i, m(i))\big] \pi_{(t)}$. By expressing this as a mixture of conditional distributions given $M$
and $\pi_{(t)}$, and then using Jensen's inequality applied to $x \mapsto x \log x$, the entropy can be bounded
above by the expected value of

$$\mathrm{ENT}\Big(\mu \Big[\prod_{i : m(i) < i} c(i, m(i))\Big] \pi_{(t)} \,\Big|\, M, \pi_{(t)}\Big) = \mathrm{ENT}\Big(\mu \Big[\prod_{i : m(i) < i} c(i, m(i))\Big] \,\Big|\, M, \pi_{(t)}\Big) \tag{20}$$

$$= \mathrm{ENT}\Big(\mu \Big[\prod_{i : m(i) < i} c(i, m(i))\Big] \,\Big|\, M\Big), \tag{21}$$

where the first equality holds by Proposition 6 and the second equality holds because the permutation
$\mu$, the product of collisions $c(i, m(i))$ and $\pi_{(t)}$ are conditionally independent given $M$. For
$1 \le k \le n$, let

$$\nu_k = \prod_{i : m(i) < i \le k} c(i, m(i)).$$

Note that the right hand side of (21) is $\mathrm{ENT}(\mu\nu_n \mid M)$ and $\nu_0 = \mathrm{id}$. Since $\mu$ is independent of $M$,
we have $\mathrm{ENT}(\mu \mid M) = \mathrm{ENT}(\mu)$ and hence

$$\mathrm{ENT}(\mu\nu_n \mid M) - \mathrm{ENT}(\mu) = \sum_{k=1}^{n} \mathrm{ENT}(\mu\nu_k \mid M) - \mathrm{ENT}(\mu\nu_{k-1} \mid M).$$

Thus, it is enough to show that for every $k$ we have

$$\mathbf{E}\big(\mathrm{ENT}(\mu\nu_k \mid M) - \mathrm{ENT}(\mu\nu_{k-1} \mid M)\big) \le \frac{-C A_k E_k}{\log n}. \tag{22}$$

Note that if $m(k) \ge k$ then $\nu_k = \nu_{k-1}$. If $m(k) < k$ then $\nu_k = \nu_{k-1} c(k, m(k))$. We can now proceed
in a way that is analogous to the proof of Lemma 7. Note that

$$\mu\nu_k = \tfrac{1}{2}\mu\nu_{k-1} + \tfrac{1}{2}\mu\nu_{k-1}(k, m(k)).$$

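Fix $i < k$, let $\lambda = \mu\nu_{k-1}$ and let $\tilde\lambda = \lambda(k, i)$. Note that $\lambda$ and $\tilde\lambda$ are the same, except that
$\tilde\lambda^{-1}(k) = \lambda^{-1}(i)$ and $\tilde\lambda^{-1}(i) = \lambda^{-1}(k)$. Note also that $\nu_{k-1}$ has $k+1, \dots, n$ as fixed points, so
$(\lambda^{-1}(k+1), \dots, \lambda^{-1}(n)) = (\mu^{-1}(k+1), \dots, \mu^{-1}(n))$. Let

$$\mathcal{F}_{k+1} = \sigma(\lambda^{-1}(k+1), \dots, \lambda^{-1}(n)) = \sigma(\mu^{-1}(k+1), \dots, \mu^{-1}(n)),$$

and define $\hat{\mathcal{F}}_{k+1} = \sigma(\mathcal{F}_{k+1}, M)$. Then we have $\mathrm{ENT}(\tilde\lambda \mid \hat{\mathcal{F}}_{k+1}) = \mathrm{ENT}(\lambda \mid \hat{\mathcal{F}}_{k+1})$ and hence

$$\mathrm{ENT}(\lambda c(k, i) \mid \hat{\mathcal{F}}_{k+1}) - \mathrm{ENT}(\lambda \mid \hat{\mathcal{F}}_{k+1}) = -d(\mathcal{L}(\tilde\lambda \mid \hat{\mathcal{F}}_{k+1}), \mathcal{L}(\lambda \mid \hat{\mathcal{F}}_{k+1})).$$

But by the projection lemma,

$$d(\mathcal{L}(\tilde\lambda \mid \hat{\mathcal{F}}_{k+1}), \mathcal{L}(\lambda \mid \hat{\mathcal{F}}_{k+1})) \ge d(\mathcal{L}(\tilde\lambda^{-1}(k) \mid \hat{\mathcal{F}}_{k+1}), \mathcal{L}(\lambda^{-1}(k) \mid \hat{\mathcal{F}}_{k+1}))$$

$$= d(\mathcal{L}(\lambda^{-1}(i) \mid \hat{\mathcal{F}}_{k+1}), \mathcal{L}(\lambda^{-1}(k) \mid \hat{\mathcal{F}}_{k+1})).$$

Thus, since $m(k)$ is $\hat{\mathcal{F}}_{k+1}$-measurable, on the event that $m(k) < k$ we have

$$\mathrm{ENT}(\mu\nu_k \mid \hat{\mathcal{F}}_{k+1}) - \mathrm{ENT}(\mu\nu_{k-1} \mid \hat{\mathcal{F}}_{k+1}) = \mathrm{ENT}(\lambda c(k, m(k)) \mid \hat{\mathcal{F}}_{k+1}) - \mathrm{ENT}(\lambda \mid \hat{\mathcal{F}}_{k+1})$$

$$\le -d(\mathcal{L}(\lambda^{-1}(m(k)) \mid \hat{\mathcal{F}}_{k+1}), \mathcal{L}(\lambda^{-1}(k) \mid \hat{\mathcal{F}}_{k+1}))$$

$$= -\sum_{i \le k} \mathbf{1}(m(k) = i)\, d(\mathcal{L}(\mu^{-1}(i) \mid \mathcal{F}_{k+1}), \mathcal{L}(\mu^{-1}(k) \mid \mathcal{F}_{k+1})),$$

where in the third line we replaced $\lambda$ by $\mu$ because $\nu_{k-1}$ does not contain the collision $c(k, m(k))$
and hence has $k$ and $m(k)$ as fixed points, and we replaced the sigma field $\hat{\mathcal{F}}_{k+1}$ by $\mathcal{F}_{k+1}$ because
$\mu$ is independent of $M$. Taking expectations gives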



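$$\mathbf{E}\big(\mathrm{ENT}(\mu\nu_k \mid \hat{\mathcal{F}}_{k+1}) - \mathrm{ENT}(\mu\nu_{k-1} \mid \hat{\mathcal{F}}_{k+1})\big) \le -\mathbf{E}\Big(\sum_{i \le k} \mathbf{P}(m(k) = i)\, d(\mathcal{L}(\mu^{-1}(i) \mid \mathcal{F}_{k+1}), \mathcal{L}(\mu^{-1}(k) \mid \mathcal{F}_{k+1}))\Big)$$

$$\le -\mathbf{E}\Big(\sum_{i \le k} A_k k^{-1}\, d(\mathcal{L}(\mu^{-1}(i) \mid \mathcal{F}_{k+1}), \mathcal{L}(\mu^{-1}(k) \mid \mathcal{F}_{k+1}))\Big)$$

$$\le -\mathbf{E}\Big[A_k\, d\Big(k^{-1} \sum_{i \le k} \mathcal{L}(\mu^{-1}(i) \mid \mathcal{F}_{k+1}),\ \mathcal{L}(\mu^{-1}(k) \mid \mathcal{F}_{k+1})\Big)\Big], \tag{23}$$

where the second inequality follows by the $A_k$-uniformity of $m(k)$ and the independence of $m(k)$
and $\mu$, and the third inequality is by Proposition 3. The first argument of $d(\cdot, \cdot)$ in the right hand
side of equation (23) is the uniform distribution over $\{1, \dots, n\} - \{\mu^{-1}(k+1), \dots, \mu^{-1}(n)\}$. Thus
the right hand side of (23) is

$$-A_k\, \mathbf{E}\big(d(\mathcal{U}, \mathcal{L}(\mu^{-1}(k) \mid \mathcal{F}_{k+1}))\big) \tag{24}$$

$$\le \frac{-C A_k}{\log n}\, \mathbf{E}\big(\mathrm{ENT}(\mathcal{L}(\mu^{-1}(k) \mid \mathcal{F}_{k+1}))\big) = \frac{-C A_k E_k}{\log n}, \tag{25}$$

where the inequality holds by Lemma 5. Since $\mu\nu_k$ and $\mu\nu_{k-1}$ agree in positions $k+1, \dots, n$,
the portion of their respective entropies that is attributable to those positions coincides, hence
Proposition 1 and equation (25) yield the theorem. □

Remark: Since for any distribution $p$ we have $d(p, p) = 0$, equation (23) is still true if $m(k)$ is only
$A_k$-uniform over $\{0, \dots, k-1\}$. So the assumptions of the theorem can be relaxed so that there is
no lower bound necessary on the probability that $m(k) = k$. □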



5 Thorp shuffle 

In this section we show that Theorem 9 implies an improved bound for the Thorp shuffle. Recall
that the Thorp shuffle has the following description. Assume that the number of cards, n, is even. 
Cut the deck into two equal piles. Drop the first card from the left pile or the right pile according to 
the outcome of a fair coin flip; then drop from the other pile. Continue this way, with independent 
coin flips deciding whether to drop left-right or right-left each time, until both piles are 
empty. 

We will actually work with the time reversal of the Thorp shuffle, which clearly has the same
mixing time. Suppose that we label the positions in the deck $0, 1, \dots, n-1$. Note that the Thorp
shuffle can be described in the following way. Each step, for $x$ with $0 \le x \le \frac{n}{2} - 1$, the cards at
positions $x$ and $x + n/2$ collide and are moved to positions $2x \bmod n$ and $2x + 1 \bmod n$. Thus,
the time reversal can be described as follows. Each step, for even numbers $x \in \{0, \dots, n-2\}$, the
cards in positions $x$ and $x + 1$ collide and are moved to positions $x/2$ and $x/2 + n/2$.
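
The two descriptions above translate directly into code. The following is a minimal simulation sketch, assuming the deck is a Python list indexed by position and that a collision swaps the two cards with probability 1/2; the function names `thorp_step` and `reverse_thorp_step` are hypothetical.

```python
import random

def thorp_step(deck, rng):
    """One Thorp shuffle: positions x and x + n/2 collide, then move to 2x and 2x + 1."""
    n = len(deck)
    half = n // 2
    new = [None] * n
    for x in range(half):
        a, b = deck[x], deck[x + half]
        if rng.random() < 0.5:  # the collision: swap with probability 1/2
            a, b = b, a
        new[2 * x], new[2 * x + 1] = a, b
    return new

def reverse_thorp_step(deck, rng):
    """One reverse Thorp shuffle: positions x and x + 1 (x even) collide,
    then move to x/2 and x/2 + n/2."""
    n = len(deck)
    half = n // 2
    new = [None] * n
    for x in range(0, n, 2):
        a, b = deck[x], deck[x + 1]
        if rng.random() < 0.5:  # the collision: swap with probability 1/2
            a, b = b, a
        new[x // 2], new[x // 2 + half] = a, b
    return new
```

Both functions assume `len(deck)` is even, in line with the standing assumption on $n$. Repeated calls, e.g. `deck = reverse_thorp_step(deck, rng)` in a loop, simulate the product of i.i.d. copies of the shuffle.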

We write $\pi_{(t)}$ for a product of $t$ i.i.d. copies of the reverse Thorp shuffle. Our main lemma is
the following.

Lemma 10 Let $t = \lceil \log_2 n \rceil$. There is a universal constant $C$ such that for any random permutation
$\mu$ we have

$$\mathrm{ENT}(\mu\pi_{(t)}) \le (1 - C/\log^2 n)\, \mathrm{ENT}(\mu).$$

Proof: Partition the locations $0, \dots, n-1$ into intervals $I_m$ as follows. Let $I_0 = \{0\}$, and for
$m = 1, 2, \dots, \lceil \log_2 n \rceil$, define $I_m = \{2^{m-1}, \dots, 2^m - 1\} \cap \{0, \dots, n-1\}$.

For $i \in \{0, \dots, n-1\}$, define $E_i = \mathbf{E}(\mathrm{ENT}(\mu, i))$. We can write the entropy of $\mu$ as

$$\mathrm{ENT}(\mu) = \sum_{m=0}^{\lceil \log_2 n \rceil} \sum_{i \in I_m} E_i.$$

Let $m^*$ be the value of $m$ that maximizes $\sum_{i \in I_m} E_i$. Then

$$\sum_{i \in I_{m^*}} E_i \ge \frac{c}{\log n}\, \mathrm{ENT}(\mu)$$

for a constant $c$. Since the reverse Thorp shuffle is in Monte form, we may use Theorem 9. We
will also use the remark immediately following Theorem 9, which says that the distribution of the
card matched with $i$ need only be uniform over $\{j : j < i\}$ in order for the conclusions of the
theorem to hold. Fix $m$ with $1 \le m \le \lceil \log_2 n \rceil$. We will show that the assumptions of the theorem
hold with $t = \lceil \log_2 n \rceil$,

$$A_i = \begin{cases} 1/8 & \text{if } i \in I_m; \\ 0 & \text{otherwise,} \end{cases}$$

and the random variable $T$ defined as follows. Let $T$ be any random variable that satisfies

$$\mathbf{P}(T = r) \ge 2^{r-m-1}, \tag{26}$$

for $r = 0, \dots, m$.

Fix $i \in I_m$. We shall show that for any $j < i$ we have $\mathbf{P}(m(i) = j) \ge 1/8i$. Define $f : \mathbf{Z} \to \mathbf{Z}$
by $f(i) = \lfloor i/2 \rfloor$. Note that if $X_s(j)$ denotes the position of card $j$ at time $s$, then $X_s(j) =
f(X_{s-1}(j)) + Z_s(j)$, where $Z_s(j)$ is a random "offset" whose distribution is uniform over $\{0, n/2\}$.
Note that in each step of the shuffle, the distance between a pair of cards is cut roughly in half if they
have the same offsets. More precisely, if $x > y$ then

$$f(x) - f(y) \le \begin{cases} (x - y)/2 & \text{if } x \text{ is odd or } y \text{ is even;} \\ (x - y)/2 + \tfrac{1}{2} & \text{otherwise.} \end{cases} \tag{27}$$

It follows that $\lceil \log_2(f(x) - f(y)) \rceil \le \lceil \log_2(x - y) \rceil$ and $\lceil \log_2(f(x) - f(y)) \rceil \le \lceil \log_2(x - y) \rceil - 1$
unless $x = y + 1$ and $x$ is even.

Say that two positions $x$ and $y$ are neighbors if $|x - y| = 1$ and $\min(x, y)$ is even. (Note that in
each step of the reverse Thorp shuffle, the neighbors collide.) Since $n$ is even we can write $n/2 = 2^k l$
for some $k \ge 0$ and odd integer $l$. Fix $i$ and $j$ with $j < i$. We shall show that $\mathbf{P}(m(i) = j) \ge 1/8i$.

First, we claim that $\mathbf{P}(X_m(j) \text{ is even}) \ge \frac{1}{2}$. To see this, note that $f^m(j) = 0$, where we write $f^r$
for the $r$-fold iterate of $f$. Hence, if $m \le k$, then $X_m(j) = \sum_{r=0}^{m-1} 2^{-r} Z_{m-r}(j)$. Each of the $Z_{m-r}(j)$
is either 0 or $2^k l$, so each term in the sum is even. Assume now that $m > k$. Suppose that the value
of $Z_{m-k}(j)$ (which is either 0 or $n/2$) is determined by an unbiased coin flip. For $m - k \le s \le m$,
let $X'_s(j)$ be what the position of card $j$ at time $s$ would have been if the outcome of the coin flip
determining $Z_{m-k}$ had been different. Since $f(x) - f(y) = \frac{1}{2}(x - y)$ if $x - y$ is even, it follows that
$|X'_s(j) - X_s(j)| = 2^{m-s} l$ for $m - k \le s \le m$. Thus $|X'_m(j) - X_m(j)| = l$, which is odd. So one of
$X'_m(j)$ and $X_m(j)$ is odd and the other is even. Since they have the same distribution, they are
each even with probability $\frac{1}{2}$.

Let $y_0 = X_0(i)$, and for $s \ge 1$ let $y_s = f(y_{s-1}) + Z_s(j)$, i.e., where card $i$ would be located after
$s$ steps if its offsets were the same as those for $j$. Let $\tau = \min\{s : |y_s - X_s(j)| = 1 \text{ and } X_s(j) \text{ is
even}\}$. Since $|i - j| < 2^m$, equation (27) and the sentence immediately following it imply that there
must be a value of $s \le m$ such that $|y_s - X_s(j)| = 1$. Combining this with the fact that $X_m(j)$ is
even with probability at least $\frac{1}{2}$ gives $\mathbf{P}(\tau \le m) \ge \frac{1}{2}$. Furthermore, given $\tau = r$, the conditional
probability that $X_s(i) = y_s$ for $0 \le s \le r$ (and hence $i$ and $j$ collide at time $r$) is $2^{-r}$. Finally, since
assumption (26) gives $\mathbf{P}(T = r) \ge 2^{r-m-1}$, it follows that $\mathbf{P}(m(i) = j) \ge 2^{-m-2} \ge \frac{1}{8i}$, where the
last step uses $i \ge 2^{m-1}$.

We have shown that the assumptions of Theorem 9 are met with $t = \lceil \log_2 n \rceil$ and $A_i = 1/8$
for $i \in I_m$. Applying this with $m = m^*$ shows that for any permutation $\mu$, we have $\mathrm{ENT}(\mu\pi_{(t)}) \le
(1 - C/\log^2 n)\, \mathrm{ENT}(\mu)$, for a universal constant $C$. It follows that for any $B \in \{1, 2, \dots\}$ we have

$$\mathrm{ENT}(\pi_{(Bt \log^3 n)}) \le (1 - C/\log^2 n)^{B \log^3 n}\, \mathrm{ENT}(\mathrm{id}) \le n^{1 - CB} \log n,$$

since $\mathrm{ENT}(\mathrm{id}) = \log n! \le n \log n$ and $1 - u \le e^{-u}$ for all $u$. If $B$ is large enough so that $n^{1-CB} \log n \le
\frac{1}{8}$ for all $n$, then $\mathrm{ENT}(\pi_{(Bt \log^3 n)}) \le \frac{1}{8}$ and hence $\|\pi_{(Bt \log^3 n)} - \mathcal{U}\| \le \frac{1}{4}$ by equation (2). It follows
that the mixing time is at most $Bt \log^3 n = O(\log^4 n)$. □
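
The halving property used in the proof, namely inequality (27) together with the statement about $\lceil \log_2(\cdot) \rceil$ that follows it, can be verified exhaustively for small positions. This is a minimal sketch with hypothetical helper names; it treats $\log_2 0$ as $-\infty$, which is an assumption about how the case $f(x) = f(y)$ should be read.

```python
import math

def f(x):
    """f(x) = floor(x / 2), the deterministic part of a reverse Thorp step."""
    return x // 2

def ceil_log2(d):
    """Ceiling of log2(d) for a non-negative integer d, with log2(0) = -infinity."""
    return -math.inf if d == 0 else (d - 1).bit_length()

def halving_violations(limit):
    """List all pairs x > y below limit that violate (27) or the log-distance claim."""
    bad = []
    for y in range(limit):
        for x in range(y + 1, limit):
            gap, new_gap = x - y, f(x) - f(y)
            # Inequality (27): the extra 1/2 is allowed only for x even and y odd.
            allowed = gap / 2 if (x % 2 == 1 or y % 2 == 0) else gap / 2 + 0.5
            exceptional = (x == y + 1 and x % 2 == 0)
            before, after = ceil_log2(gap), ceil_log2(new_gap)
            if new_gap > allowed or after > before:
                bad.append((x, y))
            elif after > before - 1 and not exceptional:
                bad.append((x, y))
    return bad
```

An empty result from `halving_violations(limit)` means that no pair of positions below `limit` contradicts the two claims.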



6 L-reversal chain 

In this section we analyze Durrett's L-reversal chain. Recall that the L-reversal chain has two
parameters, $n$ and $L$. The cards are located at the vertices of an $n$-cycle, which we label
$\{0, \dots, n-1\}$. Each step, a vertex $v$ and a number $l \in \{0, \dots, L\}$ are chosen independently and uniformly at
random. Then the interval of cards $v, v+1, \dots, v+l$ is reversed, where the numbers are taken mod
$n$. Equivalently, each step a (nonempty) interval of length at most $L$ (i.e., of size between 1 and
$L + 1$) is chosen uniformly at random and reversed. We shall assume that $L > L_0$ for a suitable
value of $L_0$ and $n > 4L$. The cases where $L$ is constant and where $n < cL$ for a constant $c$ were
both treated in [9].
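
A single step of the chain as just described can be sketched as follows. The representation (a Python list indexed by position on the cycle) and the function name `l_reversal_step` are assumptions made for illustration.

```python
import random

def l_reversal_step(deck, L, rng):
    """One step of the L-reversal chain on a cycle of len(deck) positions.

    Chooses a vertex v uniformly from the cycle and l uniformly from {0, ..., L},
    then reverses the cards in positions v, v+1, ..., v+l (taken mod n).
    """
    n = len(deck)
    v = rng.randrange(n)
    l = rng.randrange(L + 1)
    idx = [(v + s) % n for s in range(l + 1)]
    new = list(deck)
    for s, pos in enumerate(idx):
        # The card that was s places from the end of the interval moves to pos.
        new[pos] = deck[idx[-1 - s]]
    return new
```

When `l` is 0 the step leaves the deck unchanged, which matches the remark that the chosen interval has size between 1 and $L + 1$.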

We put the shuffle in Monte form as follows. Let $\mu_{i,j}$ denote the permutation that reverses the
cards in positions $i, i+1, \dots, j$ and leaves the rest unchanged. Let $Z$ be uniform over $\{1, \dots, L\}$.
Choose $v$ uniformly at random from $\{0, \dots, n-1\}$ and let

$$\pi = \begin{cases} \mu_{v,v+L} & \text{with probability } \frac{1}{2(L+1)}; \\ \mu_{v,v+L-1} & \text{with probability } \frac{1}{2(L+1)}; \\ \mu_{v,v+Z}\, c(v, v+Z) & \text{with probability } \frac{L}{L+1}. \end{cases} \tag{28}$$

Since $\mu_{v,v+2}(v, v+2) = \mathrm{id}$ and $\mu_{v,v+1}(v, v+1) = \mathrm{id}$, it is easily verified that $\pi$ has the distribution
of an L-reversal shuffle.
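
The claim that (28) reproduces the L-reversal distribution can be confirmed exactly for small parameters by enumerating both distributions with rational arithmetic. The sketch below is an illustration only; the helper names are hypothetical, and the collision $c(v, v+Z)$ is implemented as a swap of the cards in positions $v$ and $v+Z$ applied after the reversal, following the composition convention of Section 3.1.

```python
from collections import defaultdict
from fractions import Fraction

def reverse_interval(deck, v, l):
    """Reverse the cards in positions v, v+1, ..., v+l (mod n) of a tuple deck."""
    n = len(deck)
    idx = [(v + s) % n for s in range(l + 1)]
    new = list(deck)
    for s, pos in enumerate(idx):
        new[pos] = deck[idx[-1 - s]]
    return tuple(new)

def swap_positions(deck, a, b):
    """Swap the cards in positions a and b (mod n)."""
    n = len(deck)
    new = list(deck)
    new[a % n], new[b % n] = new[b % n], new[a % n]
    return tuple(new)

def direct_distribution(n, L):
    """Law of one L-reversal step from the identity: v and l uniform."""
    dist = defaultdict(Fraction)
    start = tuple(range(n))
    for v in range(n):
        for l in range(L + 1):
            dist[reverse_interval(start, v, l)] += Fraction(1, n * (L + 1))
    return dict(dist)

def monte_distribution(n, L):
    """Law of one step built from the three branches of (28)."""
    dist = defaultdict(Fraction)
    start = tuple(range(n))
    for v in range(n):
        pv = Fraction(1, n)
        dist[reverse_interval(start, v, L)] += pv * Fraction(1, 2 * (L + 1))
        dist[reverse_interval(start, v, L - 1)] += pv * Fraction(1, 2 * (L + 1))
        for z in range(1, L + 1):
            pz = pv * Fraction(L, L + 1) * Fraction(1, L)
            rev = reverse_interval(start, v, z)
            dist[rev] += pz / 2                               # collision acts as id
            dist[swap_positions(rev, v, v + z)] += pz / 2     # collision swaps
    return dict(dist)
```

Comparing `direct_distribution(n, L)` with `monte_distribution(n, L)` for a few small pairs with $n > 4L$ should give identical dictionaries.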

We write $\pi_{(t)}$ for a product of $t$ i.i.d. copies of the L-reversal shuffle. Our main technical lemma
is the following.



Lemma 11 There is a universal constant $C$ such that for any random permutation $\mu$ there is a
value of $t \in \{1, \dots, \frac{Cn^3}{L^3}\}$ such that

$$\mathrm{ENT}(\mu\pi_{(t)}) \le (1 - f(t))\, \mathrm{ENT}(\mu),$$

where $f(t) = \frac{\gamma}{\log^2 n}\left(\frac{t}{n} \wedge 1\right)$, for a universal constant $\gamma$.

Before proving Lemma 11, we first show how it gives the claimed mixing time bound.

Lemma 12 The mixing time for the L-reversal chain is $O\big((n \vee \frac{n^3}{L^3})\log^3 n\big)$.

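Proof: Let $t$ and $f$ be as defined in Lemma 11. Then

$$\frac{t}{f(t)} = \gamma^{-1}(\log^2 n)\, t\left(\frac{n}{t} \vee 1\right) = \gamma^{-1} \log^2 n\, (n \vee t) \le T, \tag{29}$$

where $T = \gamma^{-1} \log^2 n \left[n \vee \frac{Cn^3}{L^3}\right]$. Note that $1/T$ is a bound on the long run rate of entropy loss per
unit of time. Lemma 11 implies that there is a $t_1 \in \{1, \dots, \frac{Cn^3}{L^3}\}$ such that

$$\mathrm{ENT}(\pi_1 \cdots \pi_{t_1}) \le (1 - f(t_1))\, \mathrm{ENT}(\mathrm{id}),$$

and a $t_2 \in \{1, \dots, \frac{Cn^3}{L^3}\}$ such that

$$\mathrm{ENT}(\pi_1 \cdots \pi_{t_1 + t_2}) \le (1 - f(t_2))\, \mathrm{ENT}(\pi_1 \cdots \pi_{t_1}),$$

etc. Continue this way to define $t_3$, $t_4$, and so on. For $j \ge 1$ let $\tau_j = \sum_{i=1}^{j} t_i$. Then

$$\mathrm{ENT}(\pi_{(\tau_j)}) \le \Big[\prod_{i=1}^{j} (1 - f(t_i))\Big] \mathrm{ENT}(\mathrm{id}) \tag{30}$$

$$\le \exp\Big(-\sum_{i=1}^{j} f(t_i)\Big) \mathrm{ENT}(\mathrm{id}). \tag{31}$$

But since $t_i \le T f(t_i)$ by equation (29), we have

$$\sum_{i=1}^{j} f(t_i) \ge \frac{1}{T} \sum_{i=1}^{j} t_i = \frac{\tau_j}{T}.$$

It follows that

$$\mathrm{ENT}(\pi_{(\tau_j)}) \le \exp\Big(\frac{-\tau_j}{T}\Big) \mathrm{ENT}(\mathrm{id}). \tag{32}$$

Since $\mathrm{ENT}(\mathrm{id}) = \log n! \le n \log n$, it follows that if $\tau_j \ge T \log(8n \log n)$ we have $\mathrm{ENT}(\pi_{(\tau_j)}) \le \frac{1}{8}$
and hence $\|\pi_{(\tau_j)} - \mathcal{U}\| \le \frac{1}{4}$ by equation (2). It follows that the mixing time is $O(T \log(8n \log n)) =
O\big((n \vee \frac{n^3}{L^3})\log^3 n\big)$. □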
We shall now prove Lemma 11.

Proof of Lemma 11. Let $m = \lceil \log_2(n/L) \rceil$. Then we can partition the set of locations $\{0, \dots, n-1\}$
into $m + 1$ intervals as follows. Let $I_0 = \{0, \dots, L\}$, and for $1 \le k \le m$ define $I_k = \{2^{k-1}L +
1, \dots, 2^k L\} \cap \{0, \dots, n-1\}$. Define $E_k = \mathbf{E}(\mathrm{ENT}(\mu, k))$. Note that we can write the entropy of $\mu$
as

$$\mathrm{ENT}(\mu) = \sum_{k=0}^{m} \sum_{j \in I_k} E_j. \tag{33}$$

Thus, if $k^*$ maximizes $\sum_{j \in I_k} E_j$, then

$$\sum_{j \in I_{k^*}} E_j \ge \frac{1}{m+1}\, \mathrm{ENT}(\mu).$$

Suppose first that $k^* = 0$. Then we can take $t = 1$. Let $\pi$ be a random permutation corresponding to one move of the L-reversal chain. Let $E$ be the event that $\pi$ reverses $a, a+1, \dots, b$ for $a, b \in \{0, \dots, L\}$. Then (using an abuse of notation similar to that in Section 3.2) we can write $\pi$ as

\[ \pi = \alpha \pi_1 + (1 - \alpha) \pi_2, \]

where $\alpha = P(E)$, $\pi_1$ is $\pi$ conditioned on $E$, and $\pi_2$ is $\pi$ conditioned on $E^c$. Then $\mu\pi = \alpha\mu\pi_1 + (1 - \alpha)\mu\pi_2$ and hence

\[ \mathrm{ENT}(\mu\pi) = \mathrm{ENT}(\alpha\mu\pi_1 + (1 - \alpha)\mu\pi_2) \tag{34} \]

\[ \le \alpha\,\mathrm{ENT}(\mu\pi_1) + (1 - \alpha)\,\mathrm{ENT}(\mu\pi_2) \tag{35} \]

\[ \le \alpha\,\mathrm{ENT}(\mu\pi_1) + (1 - \alpha)\,\mathrm{ENT}(\mu), \tag{36} \]

where both inequalities follow from the convexity of $x \mapsto x \log x$. It follows that

\[ \mathrm{ENT}(\mu\pi) - \mathrm{ENT}(\mu) \le \alpha \big[\mathrm{ENT}(\mu\pi_1) - \mathrm{ENT}(\mu)\big]. \tag{37} \]

Note that $\pi_1$ does not move any of the cards in locations $\{L + 1, \dots, n\}$. Hence by Proposition 3 the entropy difference $\mathrm{ENT}(\mu\pi_1) - \mathrm{ENT}(\mu)$ is the expected loss in entropy attributable to positions $\{0, \dots, L\}$, i.e., $E(\mathrm{ENT}(\mu\pi_1 \mid \mathcal{F}_{L+1}) - \mathrm{ENT}(\mu \mid \mathcal{F}_{L+1}))$, where $\mathcal{F}_{L+1} = \sigma(\mu^{-1}(L+1), \dots, \mu^{-1}(n-1))$. The permutation $\pi_1$ is a step of a modified L-reversal chain on the $L + 1$ cards in the line graph $\{0, \dots, L\}$, reversing an interval of the form $a, a+1, \dots, b$ for $0 \le a < b \le L$.

In Theorem 6 of [9], it is shown (by comparison with shuffling through random transpositions [7]; see [6] for background on comparison techniques) that the log Sobolev constant for the L-reversal chain on $n$ cards is at most $B \frac{n^3}{L^2} \log n$ for a constant $B$. This remains true if we consider the modified L-reversal process on the line graph. Thus $\pi_1$ has a log Sobolev constant that is at most $2BL \log L$, and hence (by the well-known relationship between the log Sobolev constant and decay of relative entropy; see, e.g., [13]) multiplying $\mu$ by $\pi_1$ reduces the relative entropy by at least $1/(B' L \log L)$ times the entropy attributable to positions $\{0, \dots, L\}$, for a constant $B'$. Thus the right hand side of (37) is at most

\[ -\alpha (B' L \log L)^{-1} \sum_{j \in I_0} E_j \le -(8 B' n \log L)^{-1} \sum_{j \in I_0} E_j \tag{38} \]

\[ \le -(8 B' n \log^2 n)^{-1}\, \mathrm{ENT}(\mu), \tag{39} \]

where the first inequality follows from the fact that $\alpha \ge L/(8n)$, and the second from the choice $k^* = 0$.

Next we shall consider the case where $k^* \ge 1$, so that the interval is of the form $\{2^{k-1}L + 1, \dots, 2^k L\} \cap \{0, 1, \dots, n-1\}$. We will use Theorem 9 to get a decay of entropy in this case. We make the following claim.

Claim 13 Fix $k \ge 1$. There are universal constants $C$ and $\alpha > 0$ such that if $t = 4^k C n / L^3$, $T = t/2$ and

\[ A_y = \begin{cases} \alpha\big(\frac{t}{n} \wedge 1\big) & \text{if } y \in I_k; \\ 0 & \text{otherwise,} \end{cases} \]

then the assumptions of Theorem 9 are satisfied by $t$, $T$, and the $A_y$.

In order to prove this claim, it is helpful to know that the L-reversal chain enjoys certain mono- 
tonicity properties. Roughly speaking, the closer two cards are together, the more likely they are to 
collide after a given number of steps. Before proving Claim 13 we shall verify these monotonicity
properties. 

Two types of monotonicity. Fix $x$ and $y$ in $\{0, \dots, n\}$ and let $x_m$ and $y_m$ denote the positions of cards $x$ and $y$, respectively, at time $m$. Define $Z_m = |x_m - y_m|$, i.e., the graph distance between $x_m$ and $y_m$ in the $n$-cycle. Note that $Z_m$ is a Markov chain. We shall need the following lemma.

Lemma 14 Let $P$ denote the transition matrix of $Z_m$. Then $P$ is monotone, i.e., if $b \ge a$ then $P(b, \cdot) \succeq P(a, \cdot)$, where $\succeq$ denotes stochastic domination.

Proof: Fix positions $u$ and $a$ with $a \le n/2$, and let $N(a,u)$ denote the number of legal intervals (i.e., intervals of length at most $L$) that move the card in position $a$ to position $u$ without moving the card in position 0. Then

\[ N(a,u) = \begin{cases} \min\big(u, \lfloor \tfrac{1}{2}(L - a + u) + 1 \rfloor\big) & \text{if } u < a; \\ \min\big(a, \lfloor \tfrac{1}{2}(L - u + a) + 1 \rfloor\big) & \text{if } u > a. \end{cases} \]

(Recall that we assume that $n \ge 4L$.) Suppose that $|x_m - y_m| = a$. For $u \le n/2$, let $M(a,u)$ denote the number of legal intervals whose reversal at time $m$ would make $|x_{m+1} - y_{m+1}| = u$. If $a \ne u$ then $M(a,u)$ counts intervals that move $x$ but not $y$ and intervals that move $y$ but not $x$. Thus we have $M(a,u) = 2(N(a,u) + N(a, n-u))$. It is easily verified that $M(a,u)$ is nonincreasing in $a$ for $u < a \le n/2$ and nondecreasing in $a$ for $0 \le a < u$. It follows that $Z_m$ is monotone. □






We now prove that Z m has another type of monotonicity property. Note that in each move of 
the L-reversal process, there are exactly four cards that are adjacent to a different pair of cards 
after the move than they were before. We say that those cards are cut and write, e.g., "card i is 
cut at time m". We say that a location is cut if the card in that location is cut. 
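To make the notion of cut cards concrete, here is a minimal sketch of one move of the L-reversal process together with a direct check of which cards change neighbours. Several choices are assumptions made for illustration only: the deck is a Python list indexed by position on the cycle, a move reverses the positions s, s+1, ..., s+d (mod n) with the start s and the span d in {1, ..., L} drawn uniformly, and all function names are hypothetical.

```python
import random

def reverse_interval(deck, s, d):
    """Reverse the cards in cyclic positions s, s+1, ..., s+d (mod n)."""
    n = len(deck)
    positions = [(s + i) % n for i in range(d + 1)]
    cards = [deck[p] for p in positions]
    new_deck = list(deck)
    for p, c in zip(positions, reversed(cards)):
        new_deck[p] = c
    return new_deck

def neighbours(deck):
    """Map each card to the unordered pair of cards adjacent to it on the cycle."""
    n = len(deck)
    return {deck[p]: frozenset((deck[(p - 1) % n], deck[(p + 1) % n]))
            for p in range(n)}

def cut_cards(deck, s, d):
    """Cards whose pair of neighbours changes when s, ..., s+d is reversed."""
    before = neighbours(deck)
    after = neighbours(reverse_interval(deck, s, d))
    return {c for c in before if before[c] != after[c]}

def l_reversal_step(deck, L, rng=random):
    """One move: uniform start and uniform span in {1, ..., L} (an assumption)."""
    s = rng.randrange(len(deck))
    d = rng.randint(1, L)
    return reverse_interval(deck, s, d)
```

For example, on the deck `list(range(12))`, reversing positions 3 through 7 cuts the two end cards of the interval and the two cards just outside it, while the interior cards keep the same pair of neighbours.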

The cut-stopped process. It will be convenient to consider a modified version $Z'_m$ of $Z_m$, where we introduce two absorbing states 0 and $\infty$, and have the following occur when either $x$ or $y$ is cut. If $x$ and $y$ are within a distance $L$ of each other, then $Z'_m$ transitions to 0; otherwise, it transitions to $\infty$.

We shall call this modified process the cut-stopped process. We can impose an order on the state space of $\{Z'_m : m \ge 0\}$ based on the order of the positive integers, with the additional states 0 and $\infty$ as the minimum and maximum states, respectively.
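The cut-stopped process can be simulated by tracking only the positions of the two cards. The sketch below is an illustration under explicit assumptions, not the paper's construction: a move reverses positions s, ..., s+d (mod n) with s and d uniform, the cut locations are taken to be s-1, s, s+d and s+d+1, and the distance that decides between 0 and infinity is measured after the move. All names are hypothetical.

```python
import math
import random

def cyc_dist(p, q, n):
    """Graph distance between positions p and q on the n-cycle."""
    d = abs(p - q) % n
    return min(d, n - d)

def move(p, s, d, n):
    """New position of the card at p after reversing s, s+1, ..., s+d (mod n)."""
    offset = (p - s) % n
    return (s + d - offset) % n if offset <= d else p

def cut_stopped_step(px, py, s, d, n, L):
    """Return (px, py, state); state is 0, math.inf, or the current distance."""
    cut = {(s - 1) % n, s % n, (s + d) % n, (s + d + 1) % n}
    was_cut = px in cut or py in cut
    px, py = move(px, s, d, n), move(py, s, d, n)
    dist = cyc_dist(px, py, n)
    if was_cut:
        # Assumption: absorb at 0 if the cards end within distance L, else at infinity.
        return px, py, (0 if dist <= L else math.inf)
    return px, py, dist

def run_cut_stopped(px, py, n, L, max_steps, rng):
    """Run until absorption (or max_steps); return (steps taken, final state)."""
    for step in range(1, max_steps + 1):
        s, d = rng.randrange(n), rng.randint(1, L)
        px, py, state = cut_stopped_step(px, py, s, d, n, L)
        if state == 0 or state == math.inf:
            return step, state
    return max_steps, cyc_dist(px, py, n)
```

Before absorption the state is always a positive integer, so the order described above (0 below every distance, infinity above) is the usual order on these values.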

Our next lemma says that the cut-stopped process Z' m is monotone with respect to this order. 

Lemma 15 The cut-stopped process is monotone. 

Proof: The proof is a slight modification of the proof of Lemma 14. Suppose that $Z'_m = z$. Note that the probability of absorbing in 0 in the next step is a nonincreasing function of $z$, and the probability of absorbing in $\infty$ in the next step is a nondecreasing function of $z$. The rest of the argument is almost identical to the proof of Lemma 14. Fix positions $u$ and $a$ with $a \le n/2$, and let $N'(a,u)$ denote the number of intervals of length at most $L$ that move the card in position $a$ to position $u$, but neither move the card in position 0, cut position 0, nor cut position $a$. Then

\[ N'(a,u) = \begin{cases} \max\big(0, \min(u - 2, \lfloor \tfrac{1}{2}(L - a + u) \rfloor)\big) & \text{if } u < a; \\ \max\big(0, \min(a - 2, \lfloor \tfrac{1}{2}(L - u + a) \rfloor)\big) & \text{if } u > a. \end{cases} \]

Suppose that $|x_m - y_m| = a$. For $u \le n/2$, let $M'(a,u)$ denote the number of legal intervals that don't cut $x$ or $y$ and whose reversal at time $m$ would make $|x_{m+1} - y_{m+1}| = u$. If $a \ne u$ then $M'(a,u) = 2(N'(a,u) + N'(a, n-u))$. It is easily verified that $M'(a,u)$ is nonincreasing in $a$ for $u < a \le n/2$ and nondecreasing in $a$ for $0 \le a < u$. It follows that $Z'_m$ is monotone. □

We are now ready to prove Claim 13. For the convenience of the reader, we state the claim again. Recall that $I_k = \{2^{k-1}L + 1, \dots, 2^k L\} \cap \{0, \dots, n-1\}$.

Claim 13 There are universal constants $C$ and $\alpha > 0$ such that if $t = 4^k C n / L^3$, $T = t/2$ and

\[ A_y = \begin{cases} \alpha\big(\frac{t}{n} \wedge 1\big) & \text{if } y \in I_k; \\ 0 & \text{otherwise,} \end{cases} \]

then the assumptions of Theorem 9 are satisfied by $t$, $T$, and the $A_y$.

Proof: Let $y \in I_k$. We need to show that if $x < y$, then with probability at least $A_y$, cards $x$ and $y$ collide between time $T$ and time $t$, and this is the first collision that either is involved in after time $T$.

Fix $y \in I_k$ and $x$ with $x < y$. Let $\tau$ be the first time after time $T$ that either $x$ or $y$ is cut. Note that if $x$ and $y$ collide at time $\tau$ and $\tau \le t$ then $m(x) = y$. Thus, given that $|x_\tau - y_\tau| \le L$ and $\tau \le t$, the conditional probability that $m(x) = y$ is at least $1/8L$. This is because the number of intervals that cut either $x$ or $y$ is at most $4L$, so the conditional probability that $x$ and $y$ are at the endpoints of the interval that is reversed at time $\tau$ is at least $1/4L$. The conditional probability that $x$ and $y$ collide is at least half of this.

Thus it is enough to show that for a universal constant $\alpha$ we have

\[ P(|x_\tau - y_\tau| \le L,\ \tau \le t) \ge \alpha\Big(\frac{t}{n} \wedge 1\Big) L / y. \tag{40} \]

For $m \ge 0$ let $Z_m = |x_m - y_m|$. Let $\beta > 0$ be a constant and suppose that $L > 2\beta$. We claim that with probability bounded away from 0 we have $|Z_m| \le \beta L$ for some $m \le T$. To see this, let $M = \min\{m : Z_m \le L\}$. First, we will show that with probability bounded away from zero we have $M \le T'$, where $T' = T/2$. Suppose that $Z_0 > L$. Let $X$ be a random variable with the distribution of $Z_1 - Z_0$ and let $X_1, X_2, \dots$ be i.i.d. copies of $X$. Note that the random variable $Z_{T'} - Z_0$ can be coupled with the $X_i$ in such a way that $Z_{T'} - Z_0 \le \sum_{i=1}^{T'} X_i$ on the event that $M > T'$. It follows that $P(M \le T') \ge P(\sum_{i \le T'} X_i \le -Z_0)$. But since when $X$ is nonzero (which happens with probability on the order of $L/n$) it has a typical value on the order of $L$, it has second and third moments satisfying $\sigma^2 \ge C_2 L^3/n$ and $\rho \le C_3 L^4/n$, respectively. Berry-Esseen bounds (see, e.g., [8]) imply that for a universal constant $C_b$ we have

\[ |F_{T'}(x) - \Phi(x)| \le \frac{C_b \rho}{\sigma^3 \sqrt{T'}} \le \frac{C' L}{y \sqrt{C}}, \tag{41} \]

where $F_{T'}$ is the cumulative distribution function (cdf) of $\sum_{i \le T'} X_i$, $\Phi$ is the standard normal cdf, $C'$ is a constant that incorporates $C_2$, $C_3$ and $C_b$, and $C$ is the constant appearing in the definition of $t$. For the final inequality we use the fact that $t = 4T'$ is within constant factors of $C y^2 n / L^3$, since $y \in I_k$.

Since $y > L$, the quantity in (41) can be made arbitrarily close to zero for sufficiently large $C$. It follows that $\sum_{i \le T'} X_i$ is roughly normal with standard deviation a large constant times $y$, hence is less than $-Z_0$ with probability bounded away from zero. (Recall that $Z_0 = y - x < y$.) It follows that with probability bounded away from zero we have $Z_m \le L$ for some $m \le T/2$. Now note that if $x$ and $y$ are within distance $L$ then given that one of them moves in the next step, the conditional probability that they are brought to within a distance $\beta L$ is bounded away from zero. Since $t$ is much larger than $n/L$, there is probability bounded away from zero that either $x$ or $y$ is moved between time $m$ and $m + T/2$. This verifies the claim.

The above claim and the strong Markov property imply that in order to show (40), it is enough to show that if $|i - j| \le \beta L$, $m' \le T/2$ and $\tau$ is the first time that $i$ or $j$ is cut after time $m'$, then for a universal constant $\alpha > 0$ we have $P(|i_\tau - j_\tau| \le L,\ \tau \le m' + t/2) \ge \alpha(\frac{t}{n} \wedge 1) L / y$.

For every pair of cards $i$ and $j$, let $T(i,j)$ be the first time that either $i$ or $j$ is cut after time $m'$. Define $t' = \min(t/2, n)$. Let $A(i,j)$ be the event that $T(i,j) \le m' + t'$ and at time $T(i,j)$ the distance between $i$ and $j$ is at most $L$. Let $f(i,j) = P(A(i,j))$. Since $t' \le t/2$, it is enough to prove that if $|i - j| \le \beta L$ then

\[ f(i,j) \ge \alpha\Big(\frac{t}{n} \wedge 1\Big) L / y. \tag{42} \]

Since the probability that either $i$ or $j$ is involved in a cut on any given step is at most $8/n$, we have

\[ f(i,j) \le \min(1, 8t'/n). \tag{43} \]
Also, note that

\[ \sum_{i,j} f(i,j) = \sum_{l=m'+1}^{m'+t'} \sum_{i,j} P\big(T(i,j) \ge l,\ |i_l - j_l| \le L,\ \text{either } i \text{ or } j \text{ is cut at time } l\big) \ge \sum_{k=1}^{t'} \sum_{\substack{u < v \\ |u - v| \le L}} g(u,v,k), \]

where $g(u,v,k)$ is the probability that cards in locations $u$ and $v$ are cut at time $m' + k$, but neither had been cut since time $m'$. Since the L-reversal process is symmetric it is its own time-reversal. Thus, $g(u,v,k)$ is the probability that either location $u$ or $v$ is cut in the first move, but neither the card in location $u$ at time 1 nor the card in location $v$ at time 1 is cut in the next $k - 1$ moves. This probability is at least $\frac{1}{n}\big(1 - \frac{8}{n}\big)^{k-1}$. Since there are $nL$ such pairs $(u,v)$, summing over $u$, $v$ and $k$ gives

\[ \sum_{i,j} f(i,j) \ge nL \sum_{k=1}^{t'} \frac{1}{n}\Big(1 - \frac{8}{n}\Big)^{k-1} \ge c' L t', \]

for a universal constant $c'$, where the second inequality holds because $t' \le n$. It follows that for any $i$ we have

\[ \sum_{j} f(i,j) = \frac{1}{n} \sum_{i,j} f(i,j) \ge c' L t'/n. \tag{44} \]

Let $g(i,j) = P(A(i,j) \cap B(i,j))$ where $B(i,j)$ is the event that at no time before time $T(i,j)$ was the distance between $i$ and $j$ greater than $Dy$, where the constant $D$ is to be specified below. Note that

\[ \sum_{j} g(i,j) \ge \sum_{j} \big[ f(i,j) - P(A(i,j) \cap B^c(i,j)) \big], \tag{45} \]

where $B^c(i,j)$ denotes the complement of $B(i,j)$. We claim that $\sum_j g(i,j) \ge cLt'/n$ for a universal constant $c$. To see this, fix a card $i$ and $k \le t'$ and say that a card $u$ is bad if $|i_0 - u_0| \le L$, and $\max_{0 \le r \le m'+k} |i_r - u_r| > Dy$. Since the L-reversal process is symmetric, and the probability that $i$ or $u$ is cut in any given step is at most $8/n$, we have

\[ \sum_{j} P\big(A(i,j) \cap B^c(i,j) \cap \{T(i,j) = m' + k\}\big) \le \frac{8}{n} E(B), \tag{46} \]

where $B$ is the number of bad cards. Let $u$ be a card initially within distance $L$ of card $i$. If $u_m$ is the position of card $u$ at time $m$, then we can write $u_m = u + W_1 + \cdots + W_m \pmod{n}$, where $W_j \in \{-L, \dots, L\}$ is the displacement of card $u$ at time $j$. Define $u'_m = u + W_1 + \cdots + W_m$ (i.e., like $u_m$, but without the mod $n$), with a similar definition for $i'_m$. Then $u'_m$ is a symmetric random walk on the integers. Each step there is a jump with probability on the order of $L/n$ and the sizes of jumps are at most $L$. It follows that for sufficiently large $A$, the probability that $\max_{1 \le m \le k} |u'_m - u'_0| > A\big(\frac{kL}{n}\big)^{1/2} L$ can be made arbitrarily close to zero. Since $k$ is at most a constant times $y^2 n / L^3$, we have $A\big(\frac{kL}{n}\big)^{1/2} L \le A'y$ for a constant $A'$. A similar argument applies to $i'_m$.

Finally, since $|i_m - u_m| \le |i'_m - u'_m|$ (where the first $|\cdot|$ refers to distance in the $n$-cycle), it follows that for any $\epsilon > 0$, if $D$ is large enough then $P(\max_{1 \le m \le k} |i_m - u_m| > Dy) < \epsilon$. Thus, since there are at most $2L$ cards initially within a distance $L$ of card $i$, we have $E(B) \le 2L\epsilon$. Hence, summing equation (46) over $k \le t'$ gives

\[ \sum_{j} P\big(A(i,j) \cap B^c(i,j)\big) \le 16 L \epsilon t'/n. \tag{47} \]

Combining this with equations (45) and (44) gives

\[ \sum_{j} g(i,j) \ge cLt'/n, \tag{48} \]

for a constant $c$, if $\epsilon$ is small enough. We now define $\beta$ to be a constant smaller than $c/32$. Since for any $j$ we have $g(i,j) \le f(i,j) \le 8t'/n$ (by equation (43)), we have $\sum_{j : |i-j| \le \beta L} g(i,j) \le 16\beta L t'/n < cLt'/2n$, and hence

\[ \sum_{j : |i-j| > \beta L} g(i,j) \ge cLt'/2n, \]

by equation (48). Since $g(i,j) = 0$ for $|j - i| > Dy$, the average value of $g(i,j)$, where $j$ ranges over values such that $\beta L < |i - j| \le Dy$, must be at least $cLt'/4Dyn \ge \alpha L(\frac{t}{n} \wedge 1)/y$, for a constant $\alpha$. Since both $Z_m$ and the cut-stopped process $Z'_m$ are monotone by Lemmas 14 and 15, the function $g(i,j)$ is nonincreasing in $|i - j|$. It follows that $g(i,j) \ge \alpha L(\frac{t}{n} \wedge 1)/y$ if $|i - j| \le \beta L$. Since $g \le f$, this verifies equation (42), which completes the proof of Claim 13. □

Using Claim 13 with $k = k^*$ and applying Theorem 9 gives

\[ \mathrm{ENT}(\mu\pi_{(t)}) - \mathrm{ENT}(\mu) \le -\frac{C}{\log^2 n}\Big(\frac{t}{n} \wedge 1\Big) \mathrm{ENT}(\mu), \]

for a universal constant $C$, and the proof of Lemma 11 is complete. □

Acknowledgments. I am grateful to A. Soshnikov for many valuable conversations during the 
early stages of this work. 